Combining Data Sources
geomsdata.frames, regressions, etc.)Diversity of zooplankton (Diversity) prey in each of 5 replicate Blocks of three Treatment levels.1
# A tibble: 15 × 3
Treatment Diversity Block
<fct> <dbl> <fct>
1 Control 4.1 1
2 Low 2.2 1
3 High 1.3 1
4 Control 3.2 2
5 Low 2.4 2
6 High 2 2
7 Control 3 3
8 Low 1.5 3
9 High 1 3
10 Control 2.3 4
11 Low 1.3 4
12 High 1 4
13 Control 2.5 5
14 Low 2.6 5
15 High 1.6 5
ggplot adds geoms sequentially)outlier.shape = NA)SEM <- function(x) {
return(sd(x) / sqrt(length(x)))
}
ZP_means <- ZP |>
group_by(Treatment) |>
summarise(mean_Diversity = mean(Diversity),
upper_bound = mean_Diversity + 2 * SEM(Diversity),
lower_bound = mean_Diversity - 2 * SEM(Diversity),
.groups = "drop")
ZP_means# A tibble: 3 × 4
Treatment mean_Diversity upper_bound lower_bound
<fct> <dbl> <dbl> <dbl>
1 Low 2 2.51 1.49
2 Control 3.02 3.65 2.39
3 High 1.38 1.76 0.998
ggplot()data argument to geom_…ggplot() +
geom_point(data = ZP, aes(x = Treatment, y = Diversity),
size = 2,
position = position_jitter(width = 0.05, seed = 6734747)) +
geom_point(data = ZP_means,
aes(x = Treatment, y = mean_Diversity),
size = 4,
color = "gray50") +
geom_errorbar(data = ZP_means,
aes(x = Treatment, ymin = lower_bound, ymax = upper_bound),
width = 0.1, size = 1,
color = "gray50")Antibody levels in Red-winged blackbirds before and after experimental manipulation of testosterone1
ID Before After
1 1 105 85
2 2 50 74
3 3 136 145
4 4 90 86
5 5 122 148
6 6 132 148
7 7 131 150
8 8 119 142
9 9 145 151
10 10 130 113
11 11 116 118
12 12 110 99
13 13 138 150
BB_long <- BB |>
pivot_longer(cols = -ID,
names_to = "Timepoint",
values_to = "Antibody_level")
BB_long# A tibble: 26 × 3
ID Timepoint Antibody_level
<fct> <chr> <int>
1 1 Before 105
2 1 After 85
3 2 Before 50
4 2 After 74
5 3 Before 136
6 3 After 145
7 4 Before 90
8 4 After 86
9 5 Before 122
10 5 After 148
# … with 16 more rows
group = ID tells geom_line() to associate lines with IDs.Also many options in the forcats package (fct_ functions) when you already have a factor.
stat_summary() can add the output of summary functions.mean_cl_boot give the mean and bootstrapped 95% CI. (requires Hmisc)aes(group = -1) ungroups the data so that the statistics will calculate for each Timepointab linesab linesab lines: refinedab lines: refinedgeom_smooth()
lm() to GLM and GAMpredict() or another helper function
Bird species richness in different habitat patches sampled in Jamaica.1
Patch_ID n_Species Landscape_type Area log_Area
1 ag1a 24 Agriculture 3.5101825 0.54532969
2 ag1b 15 Agriculture 0.6155155 -0.21076101
3 ag1c 25 Agriculture 2.2350471 0.34928667
4 ag1d 35 Agriculture 8.2798820 0.91802415
5 ag2a 32 Agriculture 1.3736535 0.13787720
6 ag2b 40 Agriculture 59.2797938 1.77290668
7 ag2c 27 Agriculture 0.5580768 -0.25330600
8 ag2d 37 Agriculture 4.3000000 0.63346846
9 ag3a 40 Agriculture 19.6362412 1.29305836
10 ag3b 36 Agriculture 5.4911675 0.73966469
11 ag3c 27 Agriculture 1.1923333 0.07639768
12 ag3d 33 Agriculture 3.0919172 0.49022785
13 ag3e 25 Agriculture 1.9279442 0.28509446
14 ag3f 17 Agriculture 0.9881100 -0.00519472
15 ag4a 12 Agriculture 0.9714663 -0.01257228
16 ag4b 37 Agriculture 5.4848477 0.73916458
17 ag4c 33 Agriculture 4.4454332 0.64791409
18 ag4d 36 Agriculture 54.1524979 1.73361849
19 ag4e 35 Agriculture 18.7527709 1.27306545
20 ag4f 25 Agriculture 2.6943908 0.43046059
21 ag5a 37 Agriculture 9.1428019 0.96107931
22 ag5b 19 Agriculture 1.2075253 0.08189623
23 ag5c 37 Agriculture 32.5404276 1.51242326
24 ag5d 33 Agriculture 3.8893382 0.58987571
25 ag5e 31 Agriculture 7.1913356 0.85680956
26 b1a 37 Bauxite 13.3258040 1.12469342
27 b1b 23 Bauxite 1.7291887 0.23784238
28 b1c 16 Bauxite 0.5811953 -0.23567787
29 b1d 27 Bauxite 2.2335325 0.34899229
30 b1e 24 Bauxite 3.4718332 0.54055885
31 b2a 35 Bauxite 8.7560195 0.94230672
32 b2b 17 Bauxite 1.8562012 0.26862505
33 b2c 22 Bauxite 5.1314754 0.71024225
34 b2d 22 Bauxite 2.1336472 0.32912261
35 b2e 12 Bauxite 1.3722959 0.13744776
36 b3a 26 Bauxite 4.3642118 0.63990582
37 b3b 26 Bauxite 3.9374141 0.59521110
38 b3c 11 Bauxite 1.0991556 0.04105919
39 b3d 26 Bauxite 2.1061711 0.32349365
40 b3e 27 Bauxite 5.6320607 0.75066733
41 b3f 11 Bauxite 1.2987237 0.11351676
42 b4a 23 Bauxite 4.2931978 0.63278090
43 b4b 33 Bauxite 10.0659529 1.00285489
44 b4c 20 Bauxite 1.5915311 0.20181513
45 b4d 12 Bauxite 1.4673604 0.16653678
46 b4e 13 Bauxite 0.8040290 -0.09472829
47 Ref1a 36 Forest 5.5000000 0.74036269
48 Ref1b 20 Forest 1.2000000 0.07918125
49 Ref1c 25 Forest 3.5000000 0.54406804
50 Ref1d 20 Forest 1.2000000 0.07918125
51 Ref1e 32 Forest 7.5000000 0.87506126
52 Ref1g 16 Forest 15.0000000 1.17609126
53 Ref2a 34 Forest 3.5000000 0.54406804
54 Ref2b 29 Forest 2.2000000 0.34242268
55 Ref2c 40 Forest 5.5000000 0.74036269
56 Ref2d 29 Forest 2.2000000 0.34242268
57 Ref2e 22 Forest 2.2000000 0.34242268
58 Ref3a 14 Forest 1.2000000 0.07918125
59 Ref3b 29 Forest 5.5000000 0.74036269
60 Ref3c 27 Forest 3.5000000 0.54406804
61 Ref3d 30 Forest 2.2000000 0.34242268
62 Ref4a 23 Forest 1.2000000 0.07918125
63 Ref4b 30 Forest 3.5000000 0.54406804
64 Ref4c 26 Forest 2.2000000 0.34242268
65 Ref4d 26 Forest 2.2000000 0.34242268
66 Ref5a 18 Forest 1.2000000 0.07918125
67 Ref5b 31 Forest 5.5000000 0.74036269
68 Ref5c 28 Forest 5.5000000 0.74036269
69 Ref5d 23 Forest 3.5000000 0.54406804
70 Ref5e 19 Forest 1.2000000 0.07918125
71 Ref6a 18 Forest 1.2000000 0.07918125
72 Ref6b 23 Forest 2.2000000 0.34242268
73 Ref6c 27 Forest 3.5000000 0.54406804
74 Ref6d 26 Forest 3.5000000 0.54406804
75 Ref6e 33 Forest 7.5000000 0.87506126
76 Ref6f 31 Forest 25.0000000 1.39794001
77 u1a 16 Urban 2.7146097 0.43370740
78 u1b 30 Urban 4.6795641 0.67020540
79 u1c 21 Urban 1.6978078 0.22988851
80 u1d 14 Urban 1.4137406 0.15036974
81 u1e 19 Urban 6.8763832 0.83736007
82 u2a 31 Urban 4.6506448 0.66751317
83 u2b 16 Urban 1.4327452 0.15616897
84 u2c 22 Urban 3.2899767 0.51719283
85 u2d 17 Urban 1.2587029 0.09992323
86 u2e 23 Urban 4.4914799 0.65238946
87 u3a 34 Urban 6.7841003 0.83149226
88 u3b 24 Urban 5.1748479 0.71389759
89 u3c 25 Urban 3.1385548 0.49672972
90 u3d 15 Urban 1.3851313 0.14149094
91 u4a 25 Urban 1.8141023 0.25866177
92 u4b 19 Urban 1.0193984 0.00834394
93 u4c 24 Urban 2.9128820 0.46432290
94 u4d 23 Urban 2.6707253 0.42662921
95 u4e 27 Urban 8.3423559 0.92128871
geom_smooth()Defaults to a loess smoother with confidence interval
geom_smooth()geom_smooth(formula = y ~ x, method = "lm")“Standard” linear regression
geom_smooth(formula = y ~ x, method = "lm")se = FALSE
One line through all the points: aes(group = -1)
log_Area common to all levels of Landscape_typeLandscape_type
Call:
lm(formula = n_Species ~ log_Area + Landscape_type, data = BSR)
Residuals:
Min 1Q Median 3Q Max
-18.5508 -2.9911 0.2049 3.1838 10.7961
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 22.270 1.240 17.958 < 2e-16 ***
log_Area 12.271 1.243 9.869 5.35e-16 ***
Landscape_typeBauxite -5.351 1.455 -3.677 0.000401 ***
Landscape_typeForest -2.151 1.321 -1.629 0.106855
Landscape_typeUrban -5.506 1.488 -3.701 0.000369 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.84 on 90 degrees of freedom
Multiple R-squared: 0.6043, Adjusted R-squared: 0.5867
F-statistic: 34.36 on 4 and 90 DF, p-value: < 2.2e-16
predict()predict() has methods for many regression-like functions:
predict.lm()predict.glm()predict. and then Tab to see optionsBy default returns the predicted values for observed data.
predict() 1 2 3 4 5 6 7 8
28.96159 19.68346 26.55591 33.53499 23.96166 44.02541 19.16138 30.04315
9 10 11 12 13 14 15 16
38.13711 31.34631 23.20724 28.28542 25.76819 22.20600 22.11547 31.34017
17 18 19 20 21 22 23 24
30.22042 43.54330 37.89177 27.55201 34.06333 23.27471 40.82897 29.50822
25 26 27 28 29 30 31 32
32.78382 30.72001 19.83730 14.02664 21.20124 23.55199 28.48191 20.21504
33 34 35 36 37 38 39 40
25.63421 20.95742 18.60534 24.77110 24.22264 17.42254 20.88834 26.13027
41 42 43 44 45 46 47 48
18.31168 24.68366 29.22491 19.39520 18.96230 15.75626 29.20387 21.09039
49 50 51 52 53 54 55 56
26.79510 21.09039 30.85678 34.55078 26.79510 24.32067 29.20387 24.32067
57 58 59 60 61 62 63 64
24.32067 21.09039 29.20387 26.79510 24.32067 21.09039 26.79510 24.32067
65 66 67 68 69 70 71 72
24.32067 21.09039 29.20387 29.20387 26.79510 21.09039 21.09039 24.32067
73 74 75 76 77 78 79 80
26.79510 26.79510 30.85678 37.27313 22.08610 24.98821 19.58500 18.60921
81 82 83 84 85 86 87 88
27.03939 24.95517 18.68037 23.11056 17.99017 24.76958 26.96739 25.52436
89 90 91 92 93 94 95
22.85946 18.50025 19.93808 16.86638 22.46178 21.99924 28.06930
expand_grid()predict() option newdata allows passing a set of values for which to predict.
log_Area for each level of Landscape_typetidyr function expand_grid() (or crossing())expand_grid()# A tibble: 800 × 2
log_Area Landscape_type
<dbl> <fct>
1 -0.5 Agriculture
2 -0.5 Bauxite
3 -0.5 Forest
4 -0.5 Urban
5 -0.487 Agriculture
6 -0.487 Bauxite
7 -0.487 Forest
8 -0.487 Urban
9 -0.475 Agriculture
10 -0.475 Bauxite
# … with 790 more rows
newdata to predict()# A tibble: 800 × 3
log_Area Landscape_type Predicted
<dbl> <fct> <dbl>
1 -0.5 Agriculture 16.1
2 -0.5 Bauxite 10.8
3 -0.5 Forest 14.0
4 -0.5 Urban 10.6
5 -0.487 Agriculture 16.3
6 -0.487 Bauxite 10.9
7 -0.487 Forest 14.1
8 -0.487 Urban 10.8
9 -0.475 Agriculture 16.4
10 -0.475 Bauxite 11.1
# … with 790 more rows
95% prediction interval
# A tibble: 800 × 6
log_Area Landscape_type Predicted fit lwr upr
<dbl> <fct> <dbl> <dbl> <dbl> <dbl>
1 -0.5 Agriculture 16.1 16.1 5.94 26.3
2 -0.5 Bauxite 10.8 10.8 0.684 20.9
3 -0.5 Forest 14.0 14.0 3.91 24.1
4 -0.5 Urban 10.6 10.6 0.485 20.8
5 -0.487 Agriculture 16.3 16.3 6.11 26.5
6 -0.487 Bauxite 10.9 10.9 0.845 21.0
7 -0.487 Forest 14.1 14.1 4.07 24.2
8 -0.487 Urban 10.8 10.8 0.646 20.9
9 -0.475 Agriculture 16.4 16.4 6.27 26.6
10 -0.475 Bauxite 11.1 11.1 1.01 21.2
# … with 790 more rows
geom_line() adds the mean prediction fitgeom_ribbon() adds the shaded bands bound by lwr and uprgeom_ribbonggplot() +
geom_ribbon(data = Preds,
aes(x = log_Area, ymin = lwr, ymax = upr, fill = Landscape_type),
alpha = 0.1,
show.legend = FALSE) +
geom_line(data = Preds,
aes(x = log_Area, y = fit, color = Landscape_type),
size = 1) +
geom_point(data = BSR,
aes(x = log_Area, y = n_Species, color = Landscape_type)) +
labs(y = "Number of Species", x = "Area (log Ha)") +
scale_color_paletteer_d(`"rcartocolor::Bold"`, name = "Landscape Type") +
theme(legend.position = c(0.8, 0.15))